Issues in preprocessing current datasets for grammatical error correction

Authors

  • Christopher Bryant
  • Mariano Felice
Abstract

In this report, we describe some of the issues encountered when preprocessing two of the largest datasets for Grammatical Error Correction (GEC), namely the public FCE corpus and NUCLE (along with the associated CoNLL test sets). In particular, we show that converting character-level annotations to token-level annotations is not straightforward, and that sentence segmentation becomes more complex when annotations change sentence boundaries. These problems become even more complicated when multiple annotators are involved. We then describe how we handle such cases and weigh the pros and cons of different methods.
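To make the character-to-token conversion problem concrete, the following is a minimal Python sketch. It is not the authors' method: the regex tokenizer, the function names `tokenize_with_offsets` and `char_span_to_token_span`, and the policy of expanding a partial-token span to the whole token are all assumptions chosen for illustration.

```python
import re
from typing import List, Tuple

Token = Tuple[str, int, int]  # (text, char_start, char_end), end exclusive


def tokenize_with_offsets(text: str) -> List[Token]:
    """Hypothetical tokenizer: word runs and single punctuation marks,
    each returned with its character offsets in the original text."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+|[^\w\s]", text)]


def char_span_to_token_span(tokens: List[Token],
                            start: int, end: int) -> Tuple[int, int]:
    """Map a character span [start, end) to a token span [tok_start, tok_end).

    Assumed policy: any token the character span touches is included whole,
    so an edit to part of a word becomes an edit to the entire token.
    A span that touches no token (for example, an insertion point between
    words) maps to a zero-width token span at the insertion position.
    """
    overlapping = [i for i, (_, s, e) in enumerate(tokens)
                   if s < end and e > start]
    if overlapping:
        return overlapping[0], overlapping[-1] + 1
    # No token overlaps: count the tokens that end at or before the span.
    insert_at = sum(1 for _, _, e in tokens if e <= start)
    return insert_at, insert_at


text = "I has a apple ."
tokens = tokenize_with_offsets(text)
print(char_span_to_token_span(tokens, 2, 5))  # whole word "has"
print(char_span_to_token_span(tokens, 4, 5))  # only the "s" in "has"
print(char_span_to_token_span(tokens, 6, 6))  # insertion point before "a"
```

Under this policy, an annotation covering only the "s" in "has" and one covering the whole word map to the same token span, so information is lost in the conversion. This is one illustration of why the mapping is not straightforward; other policies, such as re-tokenizing around the edit, make different trade-offs.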

Related articles

The Impact of Immediate Grammatical Error Correction on Senior English Majors’ Accuracy at Hebron University

This study investigated the effects of grammatical error correction on EFL learners’ accuracy. Twenty-two male and female senior students were chosen randomly to respond to a questionnaire about their beliefs regarding immediate grammatical error correction. The study was conducted to answer this question: what is the effect of grammatical error feedback on stu...

Grammatical Error Correction of English as Foreign Language Learners

This study aimed to gain insight into error correction by applying two correction systems to the writing of three Iranian university students. The three students were invited to write four in-class essays throughout the semester, in which their verb errors and individually selected errors were corrected using the Code Correction System and the Individual Correction System. At the end of the study, the...

The Effect of Focused Corrective Feedback and Attitude on Grammatical Accuracy: A Study of Iranian EFL Learners

The study investigated the efficacy of written corrective feedback (CF) in improving Iranian EFL learners’ grammatical accuracy. It compared the effects of focused and unfocused written CF on the learners’ grammatical accuracy. Seventy-five EFL students formed one control group and two experimental groups. The focused feedback group was provided with error correction in tenses. The unfocus...


Publication date: 2016